March 28, 2015

SRM 654

vexorianvexorian

DURATION

25 min

Match Summary

Round Overview
Discuss this match

SquareScoresDiv2

Problem Details

There isn’t much to do. We can simply try each of the substrings of the string, check if it is only made of a single character and in that case, increment the result.

We can simplify the logic a bit by using a set structure or similar as you iterate the characters in the substrings. If at the end, the set contains only one character, the string is correct:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
int getscore(string s) {
  int c = 0;
  int n = s.size();
  // For each substring:
  for (int i = 0; i < n; i++) {
    for (int j = i; j < n; j++) {
      // Would a set of all the characters in substring contain only
      // one distinct character?
      set < char > st;
      for (int k = i; k <= j; k++) {
        st.insert(s[k]);
      }
      if (st.size() == 1) {
        // if so, increment c:
        c++;
      }
    }
  }
  return c;
}

1
2
3
4
5
6
7
8
9
10
11
12
class SquareScoresDiv2:
  def getscore(self, s):

  def properSubstring(t):
  return len(set(t)) == 1

c = 0
for j in range(len(s) + 1):
  for i in range(j):
  c += properSubstring(s[i: j])

return c

Even the problem statement reminds us: There are `N!` ways to choose the rooms for each furniture piece. For `N = 9`, `N! = 362880`, we can use brute force, perhaps through backtracking but in the case of permutations, your language may also help, such as c++'s next_permutation or python’s itertools.

Simulation

Once we know which tree node (room) to send each furniture piece to, we need to know if this is actually possible. What we need to do is simulate the process of taking the furniture from `s` to the wanted node. To do this, we would need to know things like knowing the best path from `s` to the picked node in advance.

Overall this approach shall work but maybe we can aim towards something simpler.

Simulating as we backtrack

Imagine the first furniture enters `s`. There will be some rooms connected to `s` that we can pick to move the furniture to. Or we can just leave the furniture in there. If we decide to move the furniture piece to another room, we see a new instance of the same problem: The furniture piece is in `s’`, there are some rooms connected to room `s’`, we can choose to move it again (but avoid moving it back to `s`, that would be redundant) or we can decide to leave it there. Eventually we will decide to leave the furniture somewhere, and we can progress to making the next furniture piece enter the tree and repeat.

Sometimes you will find that it becomes impossible to move a furniture to another room, the other room is already occupied. If all rooms are occupied, the furniture piece must stay in the location. But what if even `s` is non-empty? Then we cannot place any more furniture, and the previous decisions were invalid.

However, if we were able to put all of the furniture pieces, then we reached a valid case, and we should count it up.

SquareScores

Problem Details

Linearity of expectation

We are looking for an expected value of a score. The score is equal to the number of substrings that contain the same character. It helps to be a bit mathematical and say that `f(s)` will be the score of substring `s`. If we want to imagine `f()` will be like this:

`f(s) = 1` if `s` contains only one distinct character.
`f(s) = 0` if `s` contains multiple distinct characters.

For each substring `s`, we add `f(s)`. Something like this:

`sum_(s " is a substring of S")f(s)`

That would be the final score. We are actually looking for the expected value of this score:

r = `E(sum_(s " is a substring of S")f(s))`

This is where some knowledge about the expected value. Lately, using the linearity property of the expected value has become a popular step. The expected value is defined to be linear, which means that the expected value of a sum of some random variables is equal to the sum of the expected values of each of the random variables. This is useful in some problems because we do not even need the random variables to be independent of each other for this to work. What is relevant to our problem is that `r` above can be re-defined as:

r = `sum_(s " is a substring of" S)E(f(s))`

Given a substring `s` , what is the expected value of `f(s)`? Remember the definition above, `f(s)` can be either 1 or 0. If we go by the definition of the expected value we will have:

`1 * ["Probability " f(s) " is " 1] + 0 * ["Probability " f(s) " is " 0]`

The probability `f(s)` is 1, is the probability `s` will contain only one distinct character, therefore:

`E(f(s)) = [“Probability " s " contains only one distinct character”]`

Therefore, if we add, for each substring, the probability that it contains only one distinct character, that sum would be equal to the expected value we are looking after.

Probability

For each of the `O(n^2)` substrings, calculate the probability that the substring contains only equal characters. This is where the question marks come into game. There are some different cases to consider:

In a substring like “???”, all characters are unknown, “aaaa”, “bbbb” and “tttt” are each correct ways to fill the question marks. If the probability of each letter is `p_i` and the number of question marks is `q`, then the probability to have a string of `q` letters is `(p_i)^q`. For each character `i`, calculate `(p_i)^q` and sum all results, that is the total probability the string has only one distinct character.
If the substring looked like: “a??b?”, it is already impossible for it to have only one distinct character. No matter how we fill the question marks, there will be at least 2 distinct characters. The probability is 0.
So we focus only on substrings that contain only question marks and a single distinct character: “???c?c??c”. This time there are `q` question marks, but we already know that each of them must be c. So the result would be: `(p_c)^q`

For each string, we need two things: a) `q`, the number of question marks and b) `c` the single character that appears in the substring and is already known. So implicitly we also need to verify that there is only one distinct known character. In the straightforward way, we need `O(n)` time per substring to find `q` and `c`. This adds up to an `O(n^3)` algorithm which for the constraints of `n<=1000` is too much.

Save a factor

As a last step, we need to find `q` and `c` in a way that we do not need `O(n^3)` complexity in total.

A clever approach is to try the substrings in a way that, after fixing a starting position `i`, we go through each possible ending position `j` and we update `c` and `q` as we increment the pointer `j`

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
double calcexpectation(vector < int > pp, string s) {
  int n = s.size();
  double res = 0.0;
  vector < double > p(26, 0.0);
  for (int i = 0; i < pp.size(); i++) {
    p[i] = pp[i] / 100.0;
  }
  // The start of the substring:
  for (int i = 0; i < n; i++) {
    char ch = '?';
    bool wrong = false;
    int q = 0;
    // The end of the substring:
    for (int j = i; j < n; j++) {
      // update q and ch as necessary, mark wrong cases:
      if (s[j] != '?') {
        if (ch == '?') {
          ch = s[j];
        } else if (ch != s[j]) {
          wrong = true;
        }
      } else {
        q++;
      }
      if (!wrong) {
        if (ch == '?') {
          // substring is q question marks
          // all must become the same character
          for (int i = 0; i < 26; i++) {
            res += pow(p[i], q);
          }
        } else {
          // substring contains q question marks,
          //all must become ch
          res += pow(p[ch - 'a'], q);
        }
      }
    }
  }
  return res;
}

SuccessiveSubtraction2

Problem Details

There are up to 2000 modifications of the list of numbers. For each of them, we need to find the optimal final result after adding at most 2 pairs of parenthesis. How about we first try ignoring the way each of the individual lists is generated? Assuming we can find a reasonably fast algorithm to solve a individual case we can repeat it 2000 times. Let’s aim towards an `O(n)` algorithm.

What each number contributes

A nice way to tackle the “Find the optimal way to place the parenthesis” problem is to consider the numbers in the list:

`a[0] - a[1] - a[2] - a[3] - a[4] - … - a[n]`

Depending on how we arrange the brackets, each `a[i]` will either contribute `-a[i]` or `+a[i]` to the final result. Because all rearranging the brackets can do is multiply by `-1`. There is no way to make them contribute another value. In the above example, `a[0]` is added without a negative sign, and every other element is added with the negative sign. However, if we add a pair of parenthesis:

`a[0] - (a[1] - a[2] - a[3] - a[4]) - … - a[n]`
`a[0] - a[1] + a[2] + a[3] + a[4] - … - a[n]`

Now elements `i = 2` to `i = 4` are added to the final sum instead of subtracted. We can add another pair of parenthesis:

`a[0] - (a[1] - (a[2] - a[3]) - a[4]) - … - a[n]`
`a[0] - (a[1] - a[2] + a[3] - a[4]) - … - a[n]`
`a[0] - a[1] + a[2] - a[3] + a[4] - … - a[n]`

Now the signs alternate. How curious.

Note that there is no way to change the sign of `a[0]`, and adding a parenthesis before `a[0]` effectively does nothing to change the other results. So we can ignore it from now on and focus on maximizing the remaining part of the sum.

Parenthesis expressions

A clever idea is to take a page from the way in which we check the correctness of bracket expressions. Take this example:

`-(a[1] - (a[2] - a[3]) - a[4]) - … - a[n]`

Because `a[3]` is inside two bracket expressions, we can predict its sign in the final sum will be negative. `a[4]` is only inside 1 bracket expression, so its sign is positive. We can predict the sign based on the number of open brackets before the element.

Let’s try to calculate the result of this expression online, meaning that we won’t manually solve the signs on the parenthesis. Instead we will do a single iteration from 1 to `n`.

We begin with `-(a[1]` there is a parenthesis , but in this problem parenthesis don’t change the sign of the first element. So we add `-a[1]`. There is now one open bracket.
Now we reach `- (a[2]`, once again the bracket doesn’t change the sign of the next element, so we only consider the previously open bracket, this changes the sign from negative to positive: Add `a[2]`. There are now 2 open brackets.
`-a[3])` : With two open brackets, the sign modifications cancel each other: Add `-a[3]`. Right after a parenthesis is closed, so now there is only one open bracket.
`-a[4])` : With one open bracket, the sign is canceled and we add `a[4]`. The bracket is then closed.
`… - a[n]` : For all elements from `i=5` to `i=n`, there are no brackets so we will just add `-a[i]`.

Dynamic programming

To know the value contributed by each element to the final sum you only need to know the current amount of open brackets. Sometimes you might decide to open or close new brackets, so we also need the number of remaining brackets we can open. Let’s define a function `f(p, “open”, “remaining”)`. Which calculates the sum value of all elements with index greater than or equal to `p`, such that there are currently a number of open brackets and you can also add remaining more brackets.

As a base case, when `p = n`, this means that we have already processed all the elements. If there are any open brackets, we can close them.
When open is greater than 0, we can decide to close a bracket. This makes open decrease by 1: `f(p, “open”-1, “remaining”)`.
Otherwise, we need to add `a[p]` to the sum. The sign depends on the parity of the number of open brackets. Let `x` be `a[p]` or `-a[p]` depending on what value to add. We have two options:
- If `“remaining” > 0`, we can decide to open a bracket. This won’t change the value of `x` but will change the other values, so open increases and remaining decreases: `x + f(p+1, “open”+1, “remaining”-1)`.
- We can also do nothing, and move to the next element: `x + f(p+1, “open”, “remaining”)`.
- It is unnecessary to consider the case of adding two brackets in the same position, the second bracket would be unable to change any signs.
Just pick the maximum of these values.

This recurrence is acyclic and has `O(n)` states (for each `p`, there are some pairs `(“open”, “remaining”)` such that `“open” + “remaining” <= 2`.). Let’s use iterative dynamic programming / memoization.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
vector < int > a;
int n;

const int INF = -0x80808080;
const char B_N_INF = 0x80;

int dp[1001][3][3];

int f(int p, int open, int remaining) {
  int & res = dp[p][open][remaining];
  if (res == -INF) {
    if (p == n) {
      // nothing more to do, close any parenthesis
      res = 0;
    } else {
      int x = ((open % 2 == 0) ? -a[p] : a[p]);
      if (remaining > 0) {
        // can open one
        res = std::max(res, x + f(p + 1, open + 1, remaining - 1));
      }
      if (open > 0) {
        // can close it, we close the parenthesis BEFORE the element
        res = std::max(res, f(p, open - 1, remaining));
      }
      // do nothing
      res = std::max(res, x + f(p + 1, open, remaining));
    }
  }
  return res;
}

int solve() {
  memset(dp, B_N_INF, sizeof(dp));

  return a[0] + f(1, 0, 2);
}

vector < int > calc(vector < int > a, vector < int > p, vector < int > v) {
  this - > a = a;
  n = a.size();
  vector < int > r(p.size());
  for (int i = 0; i < p.size(); i++) {
    this - > a[p[i]] = v[i];
    r[i] = solve();
  }
  return r;
}

TwoEntrances

Problem Details

The one entrance case

Before solving the case with two entrances, it would be helpful to solve the case with a single entrance. The one entrance case was the problem in the division 2 version, but in that problem the constraints were very low so you could brute-force it: here we need something faster.

The tree is originally unrooted, but it’s natural to root it at the only special node, the entrance.

Naturally, the furniture at the entrance has to be placed last. Otherwise, you’d block your only access to the house and it would be impossible to place anything else. The key to counting the possibilities is to realize that, since the entrance is going to be unblocked for the whole duration of the move, so will the path to the entrance’s children -- which makes the placing of furniture in each of the children’s subtrees independent! (In other words, nothing we do on one subtree will end up blocking a different subtree). Based on this assumption, we can calculate the number of ways to distribute the furniture by first deciding to which subtree each piece of furniture will go, then multiplying this total by the number of ways to fill each subtree. Deciding to which subtree each piece of furniture will go is illustrated in the following picture:

path5577

How do we calculate the number of ways to fill each subtree? Well, that’s the same as our original problem! So now we can already count the number of ways to fill the tree using recursion: let’s name the number of ways of placing furniture in the subtree rooted at `i` to be equal to `ways(i)`. Therefore, the number of ways to fill a tree in which `s_0` is the root and the direct children of the root are `c_0`, `c_1`, …, `c_i` is `ways(s_0) = M(s_0, size(c_0), size(c_1), …, size(c_i)) * ways(c_0) * ways(c_1) * … * ways(c_i)`, where `M(n, k_1, k_2, …, k_r)` is a multinomial coefficient (which can be expressed as a product of binomial coefficients). Memoizing this recursion would give us an `O(n^2)` algorithm to count the number of ways to fill a tree, if you count the time to calculate the binomial coefficients. This is good enough for our constraint of `n le 3000`.

The two entrance case

We can’t easily choose a node to root the tree this time, as there are two entrances. Instead, it is helpful to draw the two entrance case as a path between the entrances `s_0` and `s_1`, with a few outward branches, as follows:

text5750

For a similar reason as to the one entrance case, the last vertex must necessarily be one of the entrances if you don’t want to get stuck. Let’s say without loss of generality `s_0` is the last furniture to be placed. What happens to the nodes connected to `s_0`? As you know `s_0` is not going to be blocked for the whole duration of the process, the path to nodes connected to `s_0` will always be available. In other words, you can treat the branches that leave `s_0` as independent problems! (exactly like in the one entrance case). This now means that we can choose to which branch each piece of furniture will go, then multiply by the number of ways to fill each branch. For the outward branches, this is simply the already discussed one-entrance case; for the branch corresponding to the path to `s_2`, we have a problem with two entrances, which is the same problem we started with.

twoent

Again, this means recursion can come to the rescue. Note that the two entrances for the recursive calls are always going to be vertices in the path, so the branches are always the same. This means that if we precompute the binomial coefficients, as well as the number of ways to fill each outward branch, we can do a dynamic programming solution to finish the problem in `O(n^2)` time.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
vector < int > adj[3100];
int ways[3100];
int prevs[3100];
int C[6100][6100];
int branchSize[3100];
vector < int > path;

const int mod = 1000000007;

class TwoEntrances {
  public: int count(vector < int > a, vector < int > b, int s1, int s2);
};

void dfs(int v, int p) {
  prevs[v] = p;
  for (int x: adj[v])
    if (x != p) {
      dfs(x, v);
    }
}

void dfsBS(int v, int p) {
  branchSize[v] = 1;
  for (int x: adj[v])
    if (x != p) {
      dfsBS(x, v);
      branchSize[v] += branchSize[x];
    }
}

void calcbranchSize(int i) {
  int v = path[i];
  branchSize[v] = 1;

  int pp = (i == 0 ? -1 : path[i - 1]);
  int ps = (i + 1 == path.size() ? -1 : path[i + 1]);

  for (int x: adj[v])
    if (x != pp && x != ps) {
      dfsBS(x, v);
      branchSize[v] += branchSize[x];
    }
}

void dfsW(int v, int p) {
  ways[v] = 1;
  int rem = branchSize[v] - 1;
  for (int x: adj[v])
    if (x != p) {
      dfsW(x, v);

      ways[v] = (ways[v] * 1 ll * C[rem][branchSize[x]]) % mod;
      ways[v] = (ways[v] * 1 ll * ways[x]) % mod;
      rem -= branchSize[x];
    }
}

// Solve the one-entrance version rooted at vertex i of the path
void calcWays(int i) {
  int v = path[i];

  int pp = (i == 0 ? -1 : path[i - 1]);
  int ps = (i + 1 == path.size() ? -1 : path[i + 1]);

  ways[v] = 1;
  int rem = branchSize[v] - 1;
  for (int x: adj[v])
    if (x != pp && x != ps) {
      dfsW(x, v);

      ways[v] = (ways[v] * 1 ll * C[rem][branchSize[x]]) % mod;
      ways[v] = (ways[v] * 1 ll * ways[x]) % mod;
      rem -= branchSize[x];
    }
}

int pd[3100][3100];

// Count the number of ways to fill the two-entrance case
// The first entrance is vertex a in the path, the second entrance is vertex b
// xx is the current size of the tree, to make finding binomials easier
int rec(int a, int b, int xx) {
  if (a == b) return ways[path[a]];

  int & ret = pd[a][b];
  if (ret == -1) {
    ret = 0;

    // If we pick a as the last vertex
    // number of ways to choose which furniture goes where
    int nn = C[xx - 1][branchSize[path[a]] - 1];
    // number of ways to fill outward branches
    nn = (nn * 1 ll * ways[path[a]]) % mod;
    // number of ways to fill two-entrance part
    nn = (nn * 1 ll * rec(a + 1, b, xx - branchSize[path[a]])) % mod;
    ret = (ret + nn) % mod;

    // If we pick b as the last vertex
    // Same as previous case
    nn = C[xx - 1][branchSize[path[b]] - 1];
    nn = (nn * 1 ll * ways[path[b]]) % mod;
    nn = (nn * 1 ll * rec(a, b - 1, xx - branchSize[path[b]])) % mod;
    ret = (ret + nn) % mod;
  }

  return ret;
}

int TwoEntrances::count(vector < int > a, vector < int > b, int s1, int s2) {
  // Precompute binomial coefficients
  for (int i = 0; i <= 6050; i++) {
    C[i][0] = 1;
    for (int j = 1; j <= i; j++) {
      C[i][j] = (C[i - 1][j - 1] + C[i - 1][j]) % mod;
    }
  }

  // Build tree
  for (int i = 0; i < a.sz; i++) {
    adj[a[i]].push_back(b[i]);
    adj[b[i]].push_back(a[i]);
  }

  // Find path between the two entrances
  dfs(s1, -1);
  int cr = s2;
  while (cr != -1) {
    path.pb(cr);
    cr = prevs[cr];
  }

  // Precompute number of ways to fill each outward branch
  for (int i = 0; i < path.size(); i++) {
    calcbranchSize(i);
    calcWays(i);
  }

  // Solve the two-entrance case recursively
  memset(pd, -1, sizeof(pd));
  return rec(0, path.size() - 1, a.size() + 1);
}

Chat on Discord

March 28, 2015

SRM 654

DURATION

categories

Tags

share

COMPETITIVE PROGRAMMING AT TOPCODER

Match Summary

SquareScoresDiv2

OneEntrance

Simulation

Simulating as we backtrack

SquareScores

Linearity of expectation

Probability

Save a factor

SuccessiveSubtraction2

What each number contributes

Parenthesis expressions

Dynamic programming

TwoEntrances

The one entrance case

The two entrance case